Abstract


In this paper rank estimates, called WLS rank estimates and computed
using iteratively reweighted least squares, are studied. They do not
require the estimation of auxiliary scale or slope parameters, nor do they
require numerical search techniques to minimize a convex surface. The
price is a small asymptotic efficiency loss. In the location model,
beginning with a resistant starting value such as the median, the WLS rank
estimates have good robustness and computational properties. The WLS rank
estimate is also extended to the regression model and an example is given.


1. Introduction and Summary


To date there have been two major methods used to construct rank
estimates in the linear model. The first is the direct minimization of
the appropriate dispersion surface proposed by Jaeckel (1972). The
minimization is equivalent to solving a set of nonlinear equations and can
be thought of as an extension of the method of Hodges and Lehmann (1963)
for defining R-estimates of location. These estimates then have the same
asymptotic efficiency properties as the rank estimates in the location case.

In  the  location  model,  R-estimates  generally  satisfy  various  criteria  for 
robustness  such  as  bounded  influence  curves  and  positive  breakdown  values. 
See  Hettmansperger  and  Utts  (1977)  and  Huber  (1981)  for  details  of  robust 
estimation. 

The second method of construction consists in developing linearized
versions of the original estimates of Jaeckel. This method uses the
asymptotic linearity of rank test statistics developed by Jureckova (1971).
Jureckova (1971) and Jaeckel (1972) used the linearity to derive the
asymptotic distribution theory of the original rank estimates but did not
use the linearity for the actual construction of estimates.

Kraft and van Eeden (1970), (1972a), (1972b) were the first to develop
linearized rank estimates. Their estimates involve a starting value along
with a scale estimate. In general these estimates are not quite as
efficient as the nonlinear rank estimates. McKean and Hettmansperger (1978)
develop linearized rank estimates for use in the linear model which have
the same asymptotic efficiency as the nonlinear versions. These linearized
estimates require the estimation of the slope of the linear approximating

Due to the time required to locate the minimum of the dispersion surface
and determine the nonlinear rank estimate, it appears that some alternative
is necessary. The practical value of the linearized rank estimates is
pointed out in the references of Kraft and van Eeden (1972b) and McKean and
Hettmansperger (1978). The question of how soon the asymptotic linearity
provides an adequate approximation still remains. Further, the effect on
efficiency in small samples of estimating scale or slope is not fully
understood.

The M-estimate approach described by Huber (1981) provides another method
of robust estimation. As in the rank estimation problem, M-estimation
requires the solution of nonlinear equations. Linearized versions of
M-estimates have been studied by Bickel (1975). Andrews (1974) discussed
some of the computational aspects of various approaches including weighted
least squares. Interestingly, he pointed out that iterating to convergence
may be less desirable than simply using a fixed number of iterations from
a good starting value. Finally, Holland and Welsch (1977) have reported
in detail on the iteratively reweighted least squares approach. They point
out that the linearized version is theoretically more desirable than the
weighted least squares; however, it is more difficult to implement because
of the need to estimate the slope. M-estimation requires the estimation
of a scale to make the resulting location and regression parameter estimates
scale invariant. In the Holland-Welsch study scale was estimated just
once with no further iterations because there is no convergence theory when
scale is iterated along with the location estimates. Their Monte Carlo
results indicate that for small samples estimation of scale has a strong
effect on the efficiency. As they point out, "In general, the effect of
estimating the scale has been swept under the rug in previous studies of
robust estimation and perhaps these results will bring attention to the fact
that it should be more carefully considered." The situation seems to
suggest that if a reasonably good estimate of scale can be incorporated into
a weighted least squares M-estimate approach the result would be a
computationally tractable estimator with fairly high efficiency.

In this paper we study weighted least squares rank estimates, defined
in Section 2. Unlike M-estimates, these estimates do not require the
estimation of auxiliary scale functionals. They do not require the estimation
of scale or of slope as in the case of linearized R-estimates, nor do they
require numerical search techniques to minimize a convex surface. In the
location model, beginning with a resistant starting value such as the
median, they have good efficiency and robustness properties; see Sections
3 and 4. In Section 5 we extend the procedure to the regression model and
discuss an example.


2.  The  Asymptotic  Distribution  of  WLS-Rank  Estimates 


For a given random sample $X_1, X_2, \ldots, X_n$ from a continuous symmetric
distribution $G(x - \theta)$, where $\theta$ is unknown, an R-estimate of
$\theta$ is a value which minimizes

$$ S(\theta) = \sum_{i=1}^{n} a(R_i^+(\theta))\,|X_i - \theta| \tag{1} $$

where $0 \le a(1) \le \ldots \le a(n)$ are constants, usually called scores,
and $R_i^+(\theta)$ is the rank of $|X_i - \theta|$ among
$|X_1 - \theta|, \ldots, |X_n - \theta|$. Below we will show
that the R-estimate can be considered as a weighted least squares
estimate with weights proportional to the ranks of the absolute deviations.

First, we define, equivalently, an R-estimate $\hat\theta$ of $\theta$ as the
solution of the following equation

$$ h(\theta) = \sum_{i=1}^{n} a(R_i^+(\theta))\,\mathrm{sign}(X_i - \theta) = 0 \tag{2} $$

Note that $h(\theta)$ is a nonincreasing step function of $\theta$; see Bauer
(1972). The R-estimate obtained is the Hodges-Lehmann (1963) estimate since
$h(0)$ is a signed rank test statistic for testing $H_0\colon \theta = 0$ vs.
$H_1\colon \theta > 0$. Except for special cases like $a(i) = 1$ with
$\hat\theta = \mathrm{med}\, X_i$ or $a(i) = i$ with
$\hat\theta = \mathrm{med}_{i \le j}\,(X_i + X_j)/2$, solving the nonlinear
equation (2) for $\theta$ is usually quite difficult. For example, for the
van der Waerden or normal scores, there is no simple form for the R-estimate.

Hettmansperger and Utts (1977), in writing (2) as

$$ \sum_i w_i(\theta)(X_i - \theta) = 0 \tag{3} $$

with

$$ w_i(\theta) = \begin{cases} a(R_i^+(\theta))/|X_i - \theta| & \text{if } |X_i - \theta| \ne 0 \\ 0 & \text{otherwise,} \end{cases} $$

were able to use an iteration procedure to obtain an R-estimate $\hat\theta_1$:

$$ \hat\theta_1 = \frac{\sum_i w_i(\hat\theta_0) X_i}{\sum_i w_i(\hat\theta_0)} = \hat\theta_0 + \frac{h(\hat\theta_0)}{\sum_i w_i(\hat\theta_0)} $$

where $\hat\theta_0$ is an initial estimate of $\theta$. Applying the same
formula, the k-step estimate $\hat\theta_k$ can be obtained:

$$ \hat\theta_k = \hat\theta_{k-1} + \frac{h(\hat\theta_{k-1})}{\sum_i w_i(\hat\theta_{k-1})} $$

We shall call them weighted least squares rank estimates or, in short,
WLS-rank estimates. To ensure convergence, Utts (1978) proposed an algorithm
which combined iteration with an interval halving procedure. She then proved
the convergence of the k-step WLS-rank estimate to the nonlinear rank
estimate.


In this paper, we discuss the efficiency and robustness properties of
the WLS-rank estimates. Although similar to the Kraft and van Eeden estimate,
in general the one-step WLS-rank estimates are not quite as efficient as
the nonlinear rank estimates; the efficiency for the k-step estimate
converges rapidly to that of the nonlinear estimate as k increases. Most of
the WLS-rank estimates considered have bounded influence. Hence the weighted
least squares rank estimation procedure provides a computationally feasible
way to find robust R-estimates.
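
The k-step iteration described above (add h divided by the sum of the weights to the current value, then repeat) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, the scores are taken to be the Wilcoxon-type choice a(i) = i/(n+1), ties among absolute deviations are broken by input order, and the interval halving safeguard of Utts (1978) is not included.

```python
import statistics

def wilcoxon_scores(n):
    """Scores a(i) = i/(n+1), i = 1..n (an assumed Wilcoxon-type choice)."""
    return [i / (n + 1) for i in range(1, n + 1)]

def wls_rank_step(x, theta, scores):
    """One step: theta + h(theta) / sum_i w_i(theta)."""
    dev = [xi - theta for xi in x]
    # Rank the absolute deviations; ties are broken by input order.
    order = sorted(range(len(x)), key=lambda i: abs(dev[i]))
    a = [0.0] * len(x)
    for rank0, i in enumerate(order):
        a[i] = scores[rank0]
    # h(theta) = sum_i a(R_i^+) sign(X_i - theta)
    h = sum(ai * ((d > 0) - (d < 0)) for ai, d in zip(a, dev))
    # w_i = a(R_i^+) / |X_i - theta|, taken as 0 when the deviation is 0
    w_sum = sum(ai / abs(d) for ai, d in zip(a, dev) if d != 0)
    return theta if w_sum == 0 else theta + h / w_sum

def wls_rank_estimate(x, k=5, start=None):
    """k-step WLS-rank estimate; the sample median is the default start."""
    theta = statistics.median(x) if start is None else start
    scores = wilcoxon_scores(len(x))
    for _ in range(k):
        theta = wls_rank_step(x, theta, scores)
    return theta
```

For example, `wls_rank_estimate(data, k=3)` starts from the median and takes three reweighting steps; passing `start=statistics.mean(data)` would instead use a least squares start.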

The  proof  of  the  asymptotic  normality  of  the  WLS-rank  estimates  will 
closely  follow  that  of  Kraft  and  van  Eeden  (1970). 

Suppose we observe a sequence $X_1, \ldots, X_n$ of independent random
variables such that $\Pr(X_i \le x) = G(x - \theta)$, $i = 1, \ldots, n$. Here
the cumulative distribution function $G$ is unknown but is assumed to be a
member of the class $\Omega$ of distributions with absolutely continuous,
symmetric densities with positive and finite Fisher information.
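Let $F \in \Omega$ and define

$$ \varphi_f(u) = -\frac{f'(F^{-1}(u))}{f(F^{-1}(u))}, \qquad 0 < u < 1. $$

Assume that $\varphi_f$ satisfies $\varphi_f = \varphi_{f1} + \varphi_{f2}$,
where $\varphi_{f1}$ is nondecreasing and $\varphi_{f2}$ is nonincreasing,
$\int_0^1 \varphi_{f1}^2\,du < \infty$ and $\int_0^1 \varphi_{f2}^2\,du < \infty$, and

$$ \varphi_f(1 - u) = -\varphi_f(u) \tag{6} $$

Define $\varphi_f^+(u) = \varphi_f[(u + 1)/2]$ and consider a rank statistic
$h_f(\theta)$ of the form

$$ h_f(\theta) = \sum_i \varphi_f^+\!\left(\frac{R_i^+(\theta)}{n+1}\right) \mathrm{sign}(X_i - \theta) \tag{7} $$

Let $\hat\theta_0$ be a consistent estimate of $\theta$; then the one-step
WLS-rank estimate is defined by

$$ \hat\theta_1 = \hat\theta_0 + \frac{h_f(\hat\theta_0)}{\sum_i w_i(\hat\theta_0)} \tag{8} $$

where

$$ w_i(\theta) = \begin{cases} \varphi_f^+\!\left(\dfrac{R_i^+(\theta)}{n+1}\right) \Big/ |X_i - \theta| & \text{if } |X_i - \theta| > 0 \\ 0 & \text{otherwise.} \end{cases} \tag{9} $$

Following Kraft and van Eeden (1970, 1972a) we will suppose, for convenience,
that the initial estimate $\hat\theta_0$ is asymptotically equivalent to a
solution of another equation

$$ h_s(\theta) = 0 \tag{10} $$

where $S \in \Omega$ is a fixed distribution function. Denote the Fisher
information by $I(f) = \int_0^1 \varphi_f^2(u)\,du$; we then have

$$ n^{-1/2} h_f(\theta) \xrightarrow{D} N(0, I(f)) \tag{11} $$

provided $I(f) < \infty$; see Hajek and Sidak (1967, p. 167).
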
Van Eeden (1972) obtained the asymptotic linearity for the signed rank
statistic, analogous to Jureckova's (1969) result. Using a special
case of this result, Kraft and van Eeden (1970, 1972a) derived a linearized
rank estimate for the center of symmetry in the one-sample location problem.
We shall state a special case of Theorem 3.3 in the paper by van Eeden (1972)
as a lemma.

Lemma. Suppose $G \in \Omega$ and $\varphi_f$ satisfies the conditions above;
then

$$ \lim_{n\to\infty} P\Bigl( \sup_{|\theta - \theta_0| \le c\,n^{-1/2}} n^{-1/2}\,\bigl| h_f(\theta) - h_f(\theta_0) + n(\theta - \theta_0) I(f,g) \bigr| > \epsilon \Bigr) = 0 $$

for $\epsilon > 0$, $c > 0$ a constant, and
$I(f,g) = \int_0^1 \varphi_f(u)\varphi_g(u)\,du$.

From this lemma we conclude that if $n^{1/2}(\hat\theta - \theta)$ is bounded
in probability then

$$ h_f(\hat\theta) = h_f(\theta) - n(\hat\theta - \theta) I(f,g) + o_p(n^{1/2}) \tag{12} $$

Notice that if we replace $h_f(\hat\theta) = 0$ by the linear approximation
then we get $\hat\theta = \theta + h_f(\theta)/(n I(f,g))$. Hence a one-step
linearized rank estimate can be written as

$$ \hat\theta_1 = \hat\theta_0 + \frac{h_f(\hat\theta_0)}{n \hat I(f,g)} \tag{13} $$

where $\hat I(f,g)$ is an estimate of $I(f,g)$.

McKean and Hettmansperger (1978) estimated $I(f,g)$ directly and showed
that there was no asymptotic efficiency loss relative to the nonlinear rank
estimate for their estimates.

We  now  establish  the  asymptotic  normality  of  the  one-step  WLS-rank 
estimate  in  the  following  theorem. 
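Theorem 1. If $G \in \Omega$, $\varphi_f$ and $\varphi_s$ satisfy the
conditions above, $\hat\theta_0$ satisfies (10) and
$n^{1/2}(\hat\theta_0 - \theta)$ is bounded in probability, then

$$ n^{1/2}(\hat\theta_1 - \theta) \xrightarrow{D} N(0, V_1(s,f,g)), \tag{14} $$

where

$$ V_1(s,f,g) = \frac{I(s)}{I^2(s,g)}\Bigl(1 - \frac{I(f,g)}{J(f,g)}\Bigr)^2 + \frac{2I(s,f)}{I(s,g)J(f,g)}\Bigl(1 - \frac{I(f,g)}{J(f,g)}\Bigr) + \frac{I(f)}{J^2(f,g)} \tag{15} $$

and

$$ J(f,g) = \int_{-\infty}^{\infty} \frac{\varphi_f(G(x))}{x}\, g(x)\,dx $$

provided $J(f,g)$ exists.

Proof. Without loss of generality, we assume that $\theta = 0$; then, using
the lemma,

$$ \hat\theta_1 = \hat\theta_0 + \frac{h_f(\hat\theta_0)}{\sum_i w_i(\hat\theta_0)} = \hat\theta_0\Bigl(1 - \frac{I(f,g)}{n^{-1}\sum_i w_i(\hat\theta_0)}\Bigr) + \frac{n^{-1}\bigl(h_f(0) + o_p(n^{1/2})\bigr)}{n^{-1}\sum_i w_i(\hat\theta_0)}. $$

Using $R_i^+(\hat\theta_0) = n G_n^+(|X_i - \hat\theta_0|)$, and since
$G_n^+(x) \xrightarrow{P} G^+(x)$ and $\hat\theta_0 \xrightarrow{P} 0$,

$$ n^{-1}\sum_i w_i(\hat\theta_0) \xrightarrow{P} \int_{-\infty}^{\infty} \frac{\varphi_f(G(x))}{x}\, g(x)\,dx = J(f,g). $$

Apply the asymptotic linearity to $h_s(\theta)$ to get

$$ \hat\theta_0 = \frac{h_s(0)}{n I(s,g)} + o_p(n^{-1/2}). $$

Hence

$$ n^{1/2}\hat\theta_1 = \frac{1}{I(s,g)}\Bigl(1 - \frac{I(f,g)}{J(f,g)}\Bigr) n^{-1/2} h_s(0) + \frac{1}{J(f,g)}\, n^{-1/2} h_f(0) + o_p(1) \tag{16} $$

Following the asymptotic theory of rank tests (Hajek and Sidak, 1967, p. 166),
we finally have

$$ n^{1/2}\hat\theta_1 \xrightarrow{D} N(0, V_1(s,f,g)). $$

Similarly we can prove the following.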


Theorem 2. Suppose $G \in \Omega$, $\varphi_f$ and $\varphi_s$ satisfy the
above conditions, $\hat\theta_0$ satisfies (10) such that
$n^{1/2}(\hat\theta_0 - \theta)$ is bounded in probability and

$$ \hat\theta_k = \hat\theta_{k-1} + \frac{h_f(\hat\theta_{k-1})}{\sum_i w_i(\hat\theta_{k-1})}, \qquad k = 1, 2, \ldots \tag{17} $$

Then

$$ n^{1/2}(\hat\theta_k - \theta) \xrightarrow{D} N(0, V_k(s,f,g)) \tag{18} $$

where

$$ V_k(s,f,g) = \frac{I(s)}{I^2(s,g)}\Bigl(1 - \frac{I(f,g)}{J(f,g)}\Bigr)^{2k} + \frac{2I(s,f)}{I(s,g)I(f,g)}\Bigl(1 - \frac{I(f,g)}{J(f,g)}\Bigr)^{k}\Bigl[1 - \Bigl(1 - \frac{I(f,g)}{J(f,g)}\Bigr)^{k}\Bigr] + \frac{I(f)}{I^2(f,g)}\Bigl[1 - \Bigl(1 - \frac{I(f,g)}{J(f,g)}\Bigr)^{k}\Bigr]^2 \tag{19} $$

We see that in (19) if $[1 - I(f,g)/J(f,g)]^k \to 0$ as $k \to \infty$ then

$$ \mathrm{asyvar}\; n^{1/2}\hat\theta_k \to I(f)/I^2(f,g) \quad \text{as } k \to \infty, \tag{20} $$

i.e., the fully iterated WLS-rank estimate has the same asymptotic variance
as the nonlinear rank estimate. The effect on the variance due to the weighted
least squares method, represented by $J(f,g)$, and the effect of the initial
estimate, represented by $I(s,g)$ and $I(s,f)$, vanish as k approaches infinity.
Further, if we are lucky enough to choose $F$ to match $G$, the underlying
distribution, then $I(f,g) = I(f)$, and

$$ \mathrm{asyvar}\; n^{1/2}\hat\theta_k \to 1/I(f). $$

That is, the WLS-rank estimate is approximately asymptotically efficient
for large k.

Showing that $[1 - I(f,g)/J(f,g)]^k \to 0$ is equivalent to proving that

$$ 0 < I(f,g) < 2J(f,g). \tag{21} $$


Theorem 3. Let $G \in \Omega$ and assume that $\varphi_f(G(t))$ is increasing
on $(0,\infty)$; then (21) holds if either

(1) $\varphi_f(G(t))/t$ is nonincreasing on $(0,\infty)$

or (2) $\varphi_f(G(t))$ is concave on $(0,\infty)$.

Proof. First note that

$$ I(f,g) = 2\int_{1/2}^{1} \varphi_f(u)\varphi_g(u)\,du = 2\int_0^{\infty} \varphi_f'(G(t))\, g^2(t)\,dt $$

and

$$ J(f,g) = 2\int_0^{\infty} \frac{\varphi_f(G(t))}{t}\, g(t)\,dt. $$

Since $\varphi_f(t) = -\varphi_f(1 - t)$, $0 \le t \le 1$, and $g(-t) = g(t)$,
$-\infty < t < \infty$, condition (1) implies that on $(0,\infty)$

$$ \frac{d}{dt}\left[\frac{\varphi_f(G(t))}{t}\right] = \frac{\varphi_f'(G(t))\, g(t)\, t - \varphi_f(G(t))}{t^2} \le 0. $$

Hence

$$ \int_0^{\infty} \varphi_f'(G(t))\, g^2(t)\,dt \le \int_0^{\infty} \frac{\varphi_f(G(t))}{t}\, g(t)\,dt < 2\int_0^{\infty} \frac{\varphi_f(G(t))}{t}\, g(t)\,dt. $$

Denote $\varphi_f(G(t))$ by $k(t)$; condition (2) implies that on $(0,\infty)$

$$ k'(t) \le \frac{k(t) - k(0)}{t} = \frac{k(t)}{t}. $$

Substitute for $k(t)$ and simplify to obtain the result.

Corollary. Suppose $F$ is the logistic distribution, so $\varphi_f$ is the
Wilcoxon score function, and suppose $G$ is symmetric and concave on
$(0,\infty)$. Then (21) holds.

The WLS-rank estimate depends on the initial estimate $\hat\theta_0$. In
applications we would use either the sample median or the sample mean as the
initial estimate. In Section 3 we discuss the stability of WLS-rank estimates
using these two initial estimates.


3.  Asymptotic  Efficiency 


We compute the asymptotic efficiency of the WLS-rank estimates relative
to the maximum likelihood estimate so we can find the efficiency loss due
to the weighted least squares rank estimation procedure. The values can
be compared to one, the optimal value. We calculate the asymptotic
efficiency for several different combinations of the initial estimate, the
score generating function and the underlying distribution. In all cases the
one-step WLS-rank estimate seems quite efficient in comparison with the fully
iterated ones. The effect of the initial estimate wears off quickly as k
increases.
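
As an illustration of how such efficiencies follow from (19), the sketch below evaluates 1/V_k for Wilcoxon scores when the underlying distribution is standard normal (Fisher information 1), with either the mean or the median as the initial estimate. The closed forms for I(f,g), J(f,g), I(s,g) and I(s,f) used here are not given in the text; they were worked out for this sketch and should be read as assumptions, as should the function and variable names. The output can be compared with the Normal rows of the Wilcoxon blocks of Table 1.

```python
import math

# Wilcoxon scores phi_f(u) = 2u - 1 with G standard normal (assumed setup).
I_F = 1.0 / 3.0                                          # integral of (2u-1)^2 on (0,1)
I_FG = 1.0 / math.sqrt(math.pi)                          # 2 * integral of g(x)^2
J_FG = 2.0 * math.asinh(1.0) / math.sqrt(2.0 * math.pi)  # 2 * int_0^inf (2Phi(t)-1) g(t)/t dt

# Constants tied to the initial estimate (score function s), also assumed.
STARTS = {
    # mean: normal scores, so I(s) = I(s,g) = 1 and I(s,f) = 1/sqrt(pi)
    "mean": {"I_s": 1.0, "I_sg": 1.0, "I_sf": 1.0 / math.sqrt(math.pi)},
    # median: sign scores, so I(s) = 1, I(s,g) = 2 g(0), I(s,f) = 1/2
    "median": {"I_s": 1.0, "I_sg": math.sqrt(2.0 / math.pi), "I_sf": 0.5},
}

def v_k(k, start):
    """Asymptotic variance of the k-step WLS-rank estimate, following (19)."""
    c = STARTS[start]
    r = (1.0 - I_FG / J_FG) ** k
    return (c["I_s"] / c["I_sg"] ** 2 * r * r
            + 2.0 * c["I_sf"] / (c["I_sg"] * I_FG) * r * (1.0 - r)
            + I_F / I_FG ** 2 * (1.0 - r) ** 2)

def efficiency(k, start):
    """Efficiency relative to the MLE when G is standard normal."""
    return 1.0 / v_k(k, start)

if __name__ == "__main__":
    for start in ("mean", "median"):
        print(start, [round(efficiency(k, start), 3) for k in (1, 2, 3, 4, 5, 10, 30)])
```

Because the ratio I(f,g)/J(f,g) is about 0.8 in this case, the factor (1 - I/J)^k shrinks by roughly a factor of five per step, which is why the effect of the starting value disappears after only a few iterations.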


-  Table  1  about  here  - 

We also calculate the asymptotic efficiencies for the WLS-rank estimator
when the underlying distribution is a contaminated normal
$G(x) = (1 - \epsilon)\Phi(x) + \epsilon\Phi(x/\sigma)$, and use the Wilcoxon
scores. The table suggests that the one-step estimate is quite efficient; the
efficiency converges rapidly and the effect of the initial estimate wears off
quickly.


-  Tables  2  and  3  about  here  - 


4. Robustness Properties


For  robustness  properties  we  show  here  the  stylized  sensitivity  curves 
(Andrews,  et  al.  1972,  Section  5E)  of  the  WLS-rank  estimates. 

The  sample  size  n  is  taken  to  be  20  and  the  sensitivity  curve  is 
stylized  by  taking  a  pseudo  sample  consisting  of  19  expected  normal  order 
statistics.  An  additional  point  x  is  then  added  as  the  20th  observation. 
The  value  of  the  estimator  evaluated  for  the  20  points  is  denoted  T(x). 

The sensitivity curve is defined by nT(x) and represents the change in T
caused by adding an additional point x. Note that for the 19 centered,
expected order statistics the value of T is zero. For $T = \bar X$ we have
nT(x) = x, linear and unbounded, while for $T = \tilde X$, the median, the
sensitivity curve is bounded and flat outside a neighborhood of zero.
Sensitivity curves provide a finite sample analog of the influence curve
discussed by Hampel (1974). Unbounded sensitivity indicates that the estimator
can be unduly influenced by a small part of the data.
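
The construction of a stylized sensitivity curve can be sketched as follows for the two reference estimators mentioned above, the mean and the median. One loud assumption: the exact expected normal order statistics are replaced by Blom's approximation, the normal quantile at (i - 0.375)/(m + 0.25); the function names are hypothetical as well.

```python
from statistics import NormalDist, mean, median

def pseudo_normal(m):
    """Approximate expected normal order statistics for a sample of size m.

    Blom's approximation is used here as a stand-in for the exact values.
    """
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (m + 0.25)) for i in range(1, m + 1)]

def sensitivity(estimator, x, m=19):
    """Stylized sensitivity n*T(x): m pseudo-observations plus one point x."""
    sample = pseudo_normal(m) + [x]
    return (m + 1) * estimator(sample)

if __name__ == "__main__":
    for x in (-10.0, -2.0, 0.0, 2.0, 10.0, 100.0):
        print(x, round(sensitivity(mean, x), 3), round(sensitivity(median, x), 3))
```

Any location estimator that accepts a list of numbers can be passed as `estimator`, so the same helper could be used to trace the curve of a k-step WLS-rank estimate.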

Figures 1-4 show the stylized sensitivity curves for the one and
five step WLS-rank estimate with Wilcoxon scores and initial estimates
$\bar x$ and $\tilde x$, the sample mean and median, respectively.

If  we  begin  with  the  median  then  the  WLS-rank  estimate  has  bounded 
sensitivity  at  the  first  step.  If  we  begin  with  the  mean,  with  unbounded 
sensitivity,  the  WLS-rank  estimate  is  less  stable  at  the  first  step  but 
the  sensitivity  becomes  bounded  as  we  take  a  few  iterations.  This  nicely 
illustrates  how  the  effect  of  the  initial  estimate  wears  off  rapidly  with 
a  few  iterations. 


-  Figures  1-4  about  here  - 




It  would  seem  preferable  to  use  the  median  as  a  starting  value;  however, 
in  more  complex  designs  we  may  only  have  least  squares  starts  available. 
Generally,  two  or  three  iterations  should  be  sufficient  with  a  resistant 
start  and  five  or  so  should  be  sufficient  with  a  least  squares  start  to 
stabilize  the  WLS-rank  estimate. 

In Section 5F of the Princeton Robustness Study, stylized breakdown
bounds are defined for estimators. For a random sample of size n, j
sample points are taken to be 100, 200, ..., 100j. The remaining n - j
points are taken to be the n - j expected normal order statistics from a
sample of size n - j. The estimator is said to break down if the resulting
estimate is greater than three. Denote by m the largest j for which the
estimate is less than three; (m/n) x 100% is then recorded. The numbers in
Tables 4 and 5 are those for the WLS-rank estimates. Five iterations are
used. Also included are the breakdown bounds for the mean, median, and
the Hodges-Lehmann estimate so the values can readily be compared with
each other.
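
The stylized breakdown bound just described can be sketched in Python for the mean and the median. As a stated assumption, Blom's approximation (the normal quantile at (i - 0.375)/(m + 0.25)) stands in for the exact expected normal order statistics, and the function names are hypothetical.

```python
from statistics import NormalDist, mean, median

def pseudo_normal(m):
    """Approximate expected normal order statistics (Blom's approximation)."""
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (m + 0.25)) for i in range(1, m + 1)]

def stylized_breakdown(estimator, n, threshold=3.0):
    """Return 100*m/n, with m the largest j whose estimate stays below threshold."""
    m = 0
    for j in range(1, n):
        # j gross outliers at 100, 200, ..., 100j plus n - j pseudo-normal points
        sample = pseudo_normal(n - j) + [100.0 * t for t in range(1, j + 1)]
        if estimator(sample) < threshold:
            m = j
    return 100.0 * m / n

if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(n, stylized_breakdown(mean, n), stylized_breakdown(median, n))
```

The loop checks every j rather than stopping at the first failure, so that m is literally the largest j for which the estimate is less than three.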


-  Tables  4  and  5  about  here  - 

From these tables we can see that most WLS-rank estimates have larger
breakdown bounds than the sample mean when the mean is used as the initial
estimate. Using the median as the initial estimate, the breakdown bounds
are smaller than that of the median, but they are close to it if we take
only one or two iterations. This again shows the robustness of the WLS-rank
estimates.


5.  Extension  to  Regression  with  an  Example 

Adichie (1967) was the first to derive estimates of the regression
coefficients in the simple linear regression model using the Hodges-Lehmann
(1963) estimation procedure. The methods used by Jureckova (1971) and Jaeckel
(1972) for multiple regression can be considered a generalization of the
methods of Hodges and Lehmann (1963) and Adichie (1967). We shall use
Jaeckel's measure of dispersion of the residuals to derive the WLS-rank
estimates of the regression parameters.

Kraft and van Eeden (1972b) proposed both linearized rank and signed
rank estimates for the linear model and showed that under regularity
conditions, the estimates are asymptotically normal. We will apply some
of their results in the discussion below.

Let $Y$ be an n x 1 vector of observations such that

$$ Y = 1\beta_0 + X_1\beta_c + e \tag{22} $$

where $X = [1, X_1]$ is a known n x (p + 1) matrix of full column rank,
$\beta_0$ is the intercept parameter and $\beta_c$ is a p x 1 vector of
regression parameters. Assume that the components of $e$ are iid and each has
a distribution $G \in \Omega$. Let $R(Y_i - X_i'\beta_c)$ denote the rank of
$Y_i - X_i'\beta_c$ among $Y_1 - X_1'\beta_c, \ldots, Y_n - X_n'\beta_c$, where
$X_i'$ is the ith row of $X_1$. Jaeckel's (1972) estimate of $\beta_c$ is a
value $\hat\beta_c$ which minimizes the convex function

$$ D(\beta_c) = \sum_i a(R(Y_i - X_i'\beta_c))\,(Y_i - X_i'\beta_c) \tag{23} $$

where $a(i) = \varphi(i/(n + 1))$ may be generated by centered versions of the
score functions introduced in Section 2, namely
$\varphi(u) = \varphi_f^+(u) - \bar\varphi$, where
$\bar\varphi = \int \varphi_f^+(u)\,du$. Because
$\int \varphi(u)\,du = \sum_i a(i) = 0$ we need to estimate $\beta_0$ separately.
This can be done as in the one-sample location problem using the residuals.




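Differentiating (23) with respect to $\beta_c$, we obtain

$$ \sum_i X_{ij}\, a(R(Y_i - X_i'\beta_c)) = 0, \qquad j = 1, 2, \ldots, p. \tag{24} $$

Using the same technique as for the one-sample problem, equation (24)
can be written in matrix form

$$ X_1'\, d(\beta_0, \beta_c)\, X_1 \beta_c = X_1'\, d(\beta_0, \beta_c)\,(Y - \beta_0 1) \tag{25} $$

where $d(\beta_0, \beta_c) = \mathrm{diag}(w_1(\beta_0, \beta_c), w_2(\beta_0, \beta_c), \ldots, w_n(\beta_0, \beta_c))$ and

$$ w_i(\beta_0, \beta_c) = \begin{cases} \dfrac{a(R(Y_i - X_i'\beta_c))}{Y_i - \beta_0 - X_i'\beta_c} & \text{if } Y_i - \beta_0 - X_i'\beta_c \ne 0 \\ 0 & \text{otherwise.} \end{cases} \tag{26} $$
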

From (25) we can define the one-step WLS-rank estimate of $\beta_c$ as follows.
Let $\hat\beta_0^{(0)}$ and $\hat\beta_c^{(0)}$ be initial estimates of
$\beta_0$ and $\beta_c$, respectively; then

$$ \hat\beta_c^{(1)} = \hat\beta_c^{(0)} + \bigl[X_1'\, d(\hat\beta_0^{(0)}, \hat\beta_c^{(0)})\, X_1\bigr]^{-1} X_1'\, d(\hat\beta_0^{(0)}, \hat\beta_c^{(0)})\,\bigl(Y - \hat\beta_0^{(0)} 1 - X_1\hat\beta_c^{(0)}\bigr) \tag{27} $$

Denote $\psi(\beta_c) = (a(R(Y_1 - X_1'\beta_c)), \ldots, a(R(Y_n - X_n'\beta_c)))'$; then

$$ \hat\beta_c^{(1)} = \hat\beta_c^{(0)} + \bigl[X_1'\, d(\hat\beta_0^{(0)}, \hat\beta_c^{(0)})\, X_1\bigr]^{-1} X_1'\, \psi(\hat\beta_c^{(0)}) \tag{28} $$

which is similar to Bickel's (1975) one-step Huber M-estimate of type I
and Beaton and Tukey's (1974) weighted least squares M-estimate. We shall
call it the one-step WLS-rank estimate of type I. $\hat\beta_0^{(1)}$ is
obtained using the one-sample procedure. We would generally use least squares
estimates to start the iteration. Our estimate is more complicated than the
ones proposed by McKean and Hettmansperger (1978) and Kraft and van Eeden
(1972b). Nevertheless, we do not need to estimate a scale parameter as they do.
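
For the simple regression case (p = 1) the type I step reduces to scalar sums and can be sketched without any matrix library. The sketch below is illustrative only: the function names are hypothetical, centered Wilcoxon scores a(i) = i/(n+1) - 1/2 are assumed, the intercept is held at a user-supplied value rather than re-estimated by the one-sample procedure, and negative weights are set to zero as a simplifying assumption of this sketch.

```python
def centered_wilcoxon_scores(n):
    """Centered scores a(i) = i/(n+1) - 1/2, which sum to zero."""
    return [i / (n + 1) - 0.5 for i in range(1, n + 1)]

def wls_rank_slope_step(x, y, beta0, beta):
    """One weighted least squares step for the slope in y = beta0 + beta*x + e."""
    n = len(x)
    a = centered_wilcoxon_scores(n)
    # Ranks are based on y_i - x_i*beta (the intercept does not affect them).
    partial = [yi - xi * beta for xi, yi in zip(x, y)]
    order = sorted(range(n), key=lambda i: partial[i])
    psi = [0.0] * n
    for rank0, i in enumerate(order):
        psi[i] = a[rank0]
    num = 0.0
    den = 0.0
    for xi, score, ri in zip(x, psi, partial):
        resid = ri - beta0
        # Weight = score / residual; zero residuals get weight 0.
        w = score / resid if resid != 0 else 0.0
        w = max(w, 0.0)  # assumption of this sketch: negative weights set to zero
        num += w * xi * resid
        den += w * xi * xi
    return beta if den == 0 else beta + num / den

def wls_rank_slope(x, y, beta0, beta, k=5):
    """k-step slope estimate from the given starting values."""
    for _ in range(k):
        beta = wls_rank_slope_step(x, y, beta0, beta)
    return beta
```

A least squares fit would be the natural source of the starting values `beta0` and `beta`; for several regressors the two sums become the matrix products appearing in the one-step formula.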
To develop the asymptotic distribution, we rewrite the model using the
centered design matrix:

$$ Y = 1\beta_0 + X_1\beta_c + e = 1\beta_0^* + X_{1c}\beta_c + e $$

where $X_{1c} = X_1 - \bar X_1$.

Following the approach of Bickel (1975), define the one-step WLS-rank
estimate of type II, $\hat\beta_c^{*(1)}$, as

$$ \hat\beta_c^{*(1)} = \hat\beta_c^{(0)} + \frac{1}{\hat J(f,g)}\,(X_{1c}' X_{1c})^{-1} X_{1c}'\, \psi(\hat\beta_c^{(0)}) \tag{29} $$

where $\hat J(f,g)$ is a consistent estimate of $J(f,g)$. The asymptotic
normality of $\hat\beta_c^{*(1)}$ under regularity conditions then can be
established following the proof of Kraft and van Eeden (1972b) (see Cheng
(1979)). We state the result in the following theorem.

Theorem 4. Under the regularity conditions of Kraft and van Eeden
(1972b), $n^{1/2}(\hat\beta_c^{*(1)} - \beta_c)$ has, asymptotically, a
multivariate normal distribution with mean vector 0 and covariance matrix
given by

$$ \left[ \frac{I(s)}{I^2(s,g)}\Bigl(1 - \frac{I(f,g)}{J(f,g)}\Bigr)^2 + \frac{2I(s,f)}{I(s,g)J(f,g)}\Bigl(1 - \frac{I(f,g)}{J(f,g)}\Bigr) + \frac{I(f)}{J^2(f,g)} \right] \Sigma_{1c}^{-1} $$

where $\Sigma_{1c}$ is positive definite and $n^{-1} X_{1c}' X_{1c} \to \Sigma_{1c}$.

The asymptotic distribution of the WLS-rank estimate of type I is the
same as that for type II. The proof requires further regularity conditions
on the weights and follows along the lines of Bickel (1975); see Cheng
(1979) for details. Hence the asymptotic covariance matrix contains the
factor (15) in the one-step case. It can be shown in the same way that
factor (19) appears for the k-step case.

As an example we consider the stack loss data analyzed by Daniel and
Wood (1971) in their Chapter 5. The example contains 21 observations and
3 parameters. Daniel and Wood studied the problem extensively using least
squares and found 4 outliers. They fitted the model after removing the
outliers. Using a robust regression procedure, Andrews (1974) obtained
a suitable fit without deleting the outliers, as did Hettmansperger and
McKean (1977) using rank estimates.

An APL program was written to calculate the WLS-rank estimates. The
initial estimates were least squares estimates, and negative weights were
set to zero. In order to check the convergence of the iteration procedure,
thirty iterations were used. In the following table we show the k-step
WLS-rank estimates using sign and Wilcoxon scores. R-estimates obtained
by Hettmansperger and McKean (1977) are also included. The estimates are
quite close. This should be the case since the same dispersion function
was used; only the techniques used to obtain the estimates were different.

Hence the weighted least squares approach achieves an acceptable
solution without searching a convex surface, which may be costly, and without
requiring the estimation of an auxiliary scale parameter.


-  Table  6  about  here  - 
Table 1. Asymptotic efficiencies of k-step WLS-rank estimates

                                      k
                1      2      3      4      5     10     30

Sign scores, $\hat\theta_0 = \bar X$
  Normal      .937   .847   .776   .728   .697   .644   .637
  Logistic    .945   .957   .955   .943   .928   .844   .755
  D.E.        .592   .677   .753   .815   .865   .976   1.0

Wilcoxon scores, $\hat\theta_0 = \bar X$
  Normal      .971   .959   .956   .955   .955   .955   .955
  Logistic    .995   1.0    1.0    1.0    1.0    1.0    1.0
  D.E.        .689   .735   .747   .75    .75    .751   .751

van der Waerden scores, $\hat\theta_0 = \bar X$
  Normal      1.0    1.0    1.0    1.0    1.0    1.0    1.0
  Logistic    .956   .955   .955   .955   .955   .955   .955
  D.E.        .628   .636   .637   .637   .637   .637   .637

Wilcoxon scores, $\hat\theta_0 = \tilde X$
  Normal      .919   .950   .954   .955   .955   .955   .955
  Logistic    .984   .999   1.0    1.0    1.0    1.0    1.0
  D.E.        .853   .780   .759   .753   .752   .751   .751

van der Waerden scores, $\hat\theta_0 = \tilde X$
  Normal      1.0    1.0    1.0    1.0    1.0    1.0    1.0
  Logistic    .947   .955   .955   .955   .955   .955   .955
  D.E.        .667   .639   .637   .637   .637   .637   .637

D.E. = double exponential.



Table 3. Asymptotic efficiencies of k-step WLS-rank estimates for contaminated
normal distributions with Wilcoxon scores.

$\hat\theta_0 = \tilde X$

                                            k
                              1      2      3      4      5     10     30

$\epsilon = .2$   $\sigma = 2$   .984   .993   .993   .993   .993   .993   .993
                  $\sigma = 3$   .928   .928   .927   .927   .927   .927   .927
                  $\sigma = 4$   .864   .858   .857   .857   .857   .857   .857

$\epsilon = .1$   $\sigma = 2$   .973   .985   .986   .986   .986   .986   .986
                  $\sigma = 3$   .950   .958   .958   .958   .958   .958   .958
                  $\sigma = 4$   .915   .920   .920   .920   .920   .920   .920

$\epsilon = .05$  $\sigma = 2$   .963   .977   .979   .979   .979   .979   .979
                  $\sigma = 3$   .956   .969   .970   .970   .970   .970   .970
                  $\sigma = 4$   .938   .948   .949   .949   .949   .949   .949



Table 4. The breakdown bounds for k-step WLS-rank estimates (initial
estimate: $\bar X$, scores: Wilcoxon).

                                             Sample Size
Number of Iterations                      5      10     20     40

k = 0* (mean)                             0      0      0      2.5
1                                         0      10     5      7.5
2                                         0      10     10     10
3                                         20     10     15     15
4                                         20     20     15     17.5
5                                         20     20     20     17.5
$\infty$* (Hodges-Lehmann estimate)       20     20     25     27.5

*Entries in these rows are from the Princeton Robustness Study
(Andrews, et al., 1972, Section 5F).


Table 5. The breakdown bounds for k-step WLS-rank estimates (initial
estimate: $\tilde X$, scores: Wilcoxon).

                                             Sample Size
Number of Iterations                      5      10     20     40

k = 0* (median)                           40     40     45     47.5
1                                         40     40     45     42.5
2                                         40     40     40     37.5
3                                         20     30     35     37.5
4                                         20     30     35     37.5
5                                         20     30     30     32.5
$\infty$* (Hodges-Lehmann estimate)       20     20     25     27.5

*Entries in these rows are from the Princeton Robustness Study
(Andrews, et al., 1972, Section 5F).


Table 6. Estimates of regression coefficients

Method                                $\alpha$    $\beta_1$       $\beta_2$   $\beta_3$

Least squares                         -39.9       .72             1.30        -.15
Least squares w/o outliers            -37.6       .80             .58         -.07
Andrews                               -37.2       .82             .52         [illegible]
Hettmansperger and McKean: Sign       -39.7       .83             .58         -.06
                           Wilcoxon   -39.95      .80             .90         -.11

WLS-rank, Sign,     k = 1             -40.01      .805            .903        -.114
                        2             -40.02      .839            .718        -.096
                        3             -40.01      .837            .649        -.078
                        4             -40.01      .837            .603        -.067
                        5             -40.00      .831            .588        -.060
                       10             -40.00      .833            .567        -.057
                       30             -40.00      .833            .568        -.057

WLS-rank, Wilcoxon, k = 1             -40.29      [illegible]     1.006       -.138
                        2             -40.36      .804            .955        -.126
                        3             -40.50      .809            .913        -.118
                        4             -40.50      .804            .907        -.113
                        5             -40.50      .799            .930        -.115
                       10             -40.50      .810            .904        -.116
                       30             -40.50      .813            .893        -.116


Figure  1.  Stylized  sensitivity  curve  for  1-step  WLS-rank  estimate 
(initial  estimate:  x,  scores:  Wilcoxon). 